[Model] Add Index-AniSora I2V support (V1 5B + V2 14B) #877

Open
dorhuri123 wants to merge 2 commits into vllm-project:main from dorhuri123:feature/index-anisora

@dorhuri123 dorhuri123 commented Jan 20, 2026

Summary

This PR adds support for Index-AniSora Image-to-Video models, a family of anime-optimized video generation models developed by Bilibili. It supports both the 5B (CogVideoX-based) and 14B (Wan2.1-based) variants.

Closes #670

Supported Models

Model | Architecture | VRAM Required | HuggingFace
AniSora V1 (5B) | CogVideoX | ~24GB | IndexTeam/AniSora-v1-i2v-diffusers
AniSora V2/V3 (14B) | Wan2.1 | ~65GB | aardsoul-music/Wan2.1-Anisora-14B

Demo Results

AniSora V1 (5B) - RTX 6000

Input Image:

anisora_v1_demo_frame

Generation Settings:

  • Prompt: "A cat playing with yarn"
  • Resolution: 480 × 720
  • Frames: 81 frames @ 16fps
  • Inference steps: 50
  • Guidance scale: 5.0

Output Video (5.06 seconds):

anisora_v1_demo_gh.mp4

AniSora V2 (14B) - Short - NVIDIA H200

Input Image:

panda_input

Generation Settings:

  • Prompt: "a panda eating bamboo, natural lighting, detailed fur"
  • Resolution: 480 × 832
  • Frames: 17 frames @ 8fps
  • Inference steps: 30
  • Guidance scale: 5.0

Output Video (2.1 seconds):

anisora_v2_output_gh.mp4

AniSora V2 (14B) - Long - NVIDIA H200

Input Image:

portrait_input_1

Generation Settings:

  • Prompt: "a woman smiling gently, soft natural lighting, cinematic quality, subtle head movement, flowing hair"
  • Resolution: 480 × 832
  • Frames: 49 frames @ 8fps
  • Inference steps: 30
  • Guidance scale: 5.0

Output Video (6.1 seconds):

anisora_v2_long.mp4
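
The clip lengths quoted in the demo sections follow from the frame count divided by the frame rate. A minimal sketch of that arithmetic, using the settings listed above (the helper name `clip_duration_seconds` is made up for illustration and is not part of the PR):

```python
def clip_duration_seconds(num_frames: int, fps: int) -> float:
    """Return the playback length of a clip in seconds."""
    return num_frames / fps


# (num_frames, fps) pairs copied from the three demo sections above.
demo_settings = {
    "AniSora V1 (5B)": (81, 16),
    "AniSora V2 (14B), short": (17, 8),
    "AniSora V2 (14B), long": (49, 8),
}

for name, (frames, fps) in demo_settings.items():
    print(f"{name}: {clip_duration_seconds(frames, fps):.2f} s")
```

The same relation is handy when picking `--num_frames` and `--fps` for a target clip length.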

Usage

V1 (5B)

python examples/offline_inference/image_to_video/anisora_image_to_video.py \
  --model IndexTeam/AniSora-v1-i2v-diffusers \
  --image input.png \
  --prompt "anime scene, smooth motion" \
  --height 480 \
  --width 720 \
  --num_frames 81 \
  --guidance_scale 5.0 \
  --num_inference_steps 50 \
  --fps 16 \
  --output anisora_v1.mp4

V2/V3 (14B)

python examples/offline_inference/image_to_video/anisora_v2_image_to_video.py \
  --image input.png \
  --prompt "anime scene, high quality animation" \
  --height 480 \
  --width 832 \
  --num-frames 49 \
  --guidance-scale 5.0 \
  --num-inference-steps 30 \
  --fps 8 \
  --output anisora_v2.mp4

Changes

New Files

  • vllm_omni/diffusion/models/anisora/ - AniSora pipeline module
    • pipeline_anisora_i2v_cogvideox.py - V1 (5B) CogVideoX-based pipeline
    • pipeline_anisora_v2_i2v.py - V2/V3 (14B) Wan2.1-based pipeline with hybrid loading
    • __init__.py - Module exports
  • examples/offline_inference/image_to_video/anisora_image_to_video.py - V1 CLI example
  • examples/offline_inference/image_to_video/anisora_v2_image_to_video.py - V2 CLI example

Modified Files

  • examples/offline_inference/image_to_video/README.md - Added AniSora documentation
  • vllm_omni/diffusion/registry.py - Register AniSora V1/V2 pipelines and their post-/pre-process hooks

Technical Notes

V2 Hybrid Loading

The V2 pipeline uses a hybrid loading approach because community-converted AniSora weights use different config/naming:

  • VAE, T5 text encoder, CLIP image encoder loaded from Wan-AI/Wan2.1-I2V-14B-480P-Diffusers
  • Transformer weights loaded from community AniSora checkpoints
  • Includes comprehensive key name conversion (AniSora → diffusers format)

Key Name Conversions

Community AniSora weights use different naming conventions:

  • self_attn → attn1
  • cross_attn → attn2
  • ffn → ff
  • k → to_k, q → to_q, v → to_v, o → to_out.0
  • modulation → scale_shift_table
  • And additional mappings for full compatibility
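
The renames listed above can be sketched as a segment-by-segment rewrite of each state-dict key. This is only an illustration of the listed mappings, not the PR's conversion code: the function name `convert_key`, the example keys, and the rule that `q`/`k`/`v`/`o` are renamed only inside an attention block are all assumptions, and the "additional mappings" are not covered.

```python
# Segment renames taken from the list above.
SEGMENT_MAP = {
    "self_attn": "attn1",
    "cross_attn": "attn2",
    "ffn": "ff",
    "modulation": "scale_shift_table",
}
# Projection renames, applied only after an attention segment (assumption).
ATTN_PROJ_MAP = {"q": "to_q", "k": "to_k", "v": "to_v", "o": "to_out.0"}


def convert_key(key: str) -> str:
    """Rewrite one AniSora-style key into the diffusers-style name."""
    converted = []
    inside_attn = False
    for segment in key.split("."):
        if segment in ("self_attn", "cross_attn"):
            inside_attn = True
            converted.append(SEGMENT_MAP[segment])
        elif inside_attn and segment in ATTN_PROJ_MAP:
            converted.append(ATTN_PROJ_MAP[segment])
        else:
            converted.append(SEGMENT_MAP.get(segment, segment))
    return ".".join(converted)


def convert_state_dict(state_dict: dict) -> dict:
    """Apply convert_key to every key, leaving the values untouched."""
    return {convert_key(k): v for k, v in state_dict.items()}
```

Keeping the mapping in plain dictionaries makes it straightforward to diff against the keys the target transformer actually expects.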

Testing

Model | GPU | Result
V1 (5B) | RTX 6000 (~24GB) | ✅ Generates valid video with motion
V2 (14B) | NVIDIA H200 (~140GB) | ✅ Generates valid video with motion

Both pipelines produce output with proper animation.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 81f0eab187

Comment on lines +56 to +60
def __init__(
    self,
    *,
    model_path: str = "Disty0/Index-anisora-5B-diffusers",
    dtype: torch.dtype = torch.bfloat16,

[P1] Accept od_config in AniSora pipeline constructor

OmniDiffusion instantiates all registered diffusion models via initialize_model, which always calls model_class(od_config=od_config). This constructor only accepts model_path/dtype/device, so using AniSora through the normal Omni/Diffusers loader path will immediately raise a TypeError for the unexpected od_config kwarg and prevent the model from loading at all.
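
A minimal sketch of the failure mode and the fix described above. The config class, both pipeline classes, and the body of `initialize_model` are hypothetical stand-ins; only the call shape `model_class(od_config=od_config)` comes from the review comment.

```python
from dataclasses import dataclass


@dataclass
class SketchDiffusionConfig:
    """Hypothetical stand-in for the real od_config object."""
    model: str
    dtype: str = "bfloat16"


class OldStylePipeline:
    """Mirrors the reviewed signature: no od_config parameter."""

    def __init__(self, *, model_path: str = "some/model", dtype: str = "bfloat16"):
        self.model_path = model_path
        self.dtype = dtype


class ConfigAwarePipeline:
    """Accepts od_config and reads its settings from it."""

    def __init__(self, *, od_config: SketchDiffusionConfig):
        self.model_path = od_config.model
        self.dtype = od_config.dtype


def initialize_model(model_class, od_config):
    # Same call shape the review describes.
    return model_class(od_config=od_config)
```

With these definitions, `initialize_model(OldStylePipeline, cfg)` raises `TypeError` for the unexpected keyword, while `initialize_model(ConfigAwarePipeline, cfg)` constructs normally.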

Comment on lines +501 to +504
def __call__(
    self,
    prompt: str | list[str],
    image: PIL.Image.Image,

[P1] Provide forward(req) entry point for AniSora V2

The diffusion engine executes models via pipeline.forward(req) (with an OmniDiffusionRequest), but this class only defines __call__(prompt, image, ...) and never overrides forward. That means nn.Module.forward will raise NotImplementedError at runtime even if the model loads, so AniSora V2 cannot be run through Omni until a forward wrapper is added.
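
One way to read this suggestion is a thin `forward` that unpacks the request and delegates to the existing `__call__`. The sketch below uses plain Python classes so it runs without torch; `SketchRequest` and its field names are assumptions and do not reflect the real `OmniDiffusionRequest` layout.

```python
from dataclasses import dataclass, field


@dataclass
class SketchRequest:
    """Hypothetical request object; the real field names may differ."""
    prompt: str
    image: object
    sampling: dict = field(default_factory=dict)


class SketchPipeline:
    def __call__(self, prompt, image, **sampling):
        # Placeholder for the real denoising loop: echo the resolved inputs.
        return {"prompt": prompt, "image": image, **sampling}

    def forward(self, req: SketchRequest):
        # Engine-facing entry point: unpack the request, reuse __call__.
        return self(prompt=req.prompt, image=req.image, **req.sampling)
```

Keeping `forward` as a wrapper means the direct `pipeline(prompt, image, ...)` usage and the engine path share one implementation.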

@dorhuri123 dorhuri123 force-pushed the feature/index-anisora branch from d1c2809 to 537f736 Compare January 20, 2026 23:56
Collaborator

@lishunyang12 lishunyang12 left a comment

Thanks for your contribution. Amazing work! I will review it within the next two days.

@lishunyang12
Collaborator

I saw you introduced new example files. Is it possible to reuse the script we already have?

@lishunyang12
Collaborator

Fix conflicts, thanks

@dorhuri123 dorhuri123 force-pushed the feature/index-anisora branch from 1c579ed to b1182be Compare January 28, 2026 09:19
@dorhuri123
Author

dorhuri123 commented Jan 28, 2026

Thanks for the review! I rebased the PR on main, resolved conflicts, and pushed the updated branch.

V2 (Wan2.1) updates:

  • Added a minimal fallback in OmniDiffusion for AniSora V2/V3 repos that don’t ship model_index.json, so _class_name resolves to AniSoraV2I2VPipeline.
  • Fixed CLIP conditioning in the V2 pipeline to prefer pil_image over preprocessed_image (avoids PIL conversion errors during warmup).
  • For large-model runs (e.g., AniSora V2), I used PYTORCH_ALLOC_CONF=expandable_segments:True to reduce allocator fragmentation on big GPUs.

V1 (CogVideoX) updates:

  • Added CogVideoXImageToVideoPipeline alias in the registry for the 5B model.
  • Added pre/post-process hooks for the CogVideoX pipeline.

Shared updates (V1 + V2):

  • Implemented required forward() + load_weights() for vLLM integration.
  • Fixed image preprocessing & device handling and corrected self.dtype usage.
  • Declared support_image_input=True and ensured model weights are moved to device before inference.

Testing:

  • V1 (CogVideoX) ran on RTX 6000 S without setting PYTORCH_ALLOC_CONF.
  • V2 (Wan2.1) ran on H200 (short config) with the allocator env set.
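
The allocator setting mentioned above can also be applied from Python, provided it happens before torch is imported. This is a small sketch rather than something the PR does; the variable name is taken from the comment above, and as far as I know older PyTorch releases read the same option from `PYTORCH_CUDA_ALLOC_CONF`, so check which name the installed version honours.

```python
import os

# Set the allocator option before importing torch so the CUDA caching
# allocator sees it at initialisation. setdefault keeps any value the
# user has already exported in the shell.
os.environ.setdefault("PYTORCH_ALLOC_CONF", "expandable_segments:True")

print(os.environ["PYTORCH_ALLOC_CONF"])
```

Exporting the variable in the shell before launching the example script has the same effect.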

Contributor

Copilot AI left a comment

Pull request overview

This PR adds Index-AniSora image-to-video support to vLLM-Omni, covering both the CogVideoX-based 5B model and the Wan2.1-based 14B models, and wires them into the Omni diffusion registry and offline inference examples.

Changes:

  • Extend OmniDiffusion initialization logic to infer AniSora V2/V3 Wan2.1-based pipelines when model_index.json is missing and only config.json is available.
  • Register new AniSora pipelines (AniSoraI2VCogVideoXPipeline and AniSoraV2I2VPipeline) with corresponding pre-/post-processing hooks and implement their model loading, key-conversion, and I2V sampling logic.
  • Update image-to-video examples and docs to describe AniSora usage and add the AniSora 5B pipeline to the supported models list.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 11 comments.

File | Description
vllm_omni/entrypoints/omni_diffusion.py | Adds a FileNotFoundError guard when model_index.json is absent and introduces a special-case fallback that maps AniSora V2/V3 Wan2.1-based model IDs to the AniSoraV2I2VPipeline.
vllm_omni/diffusion/registry.py | Registers AniSoraV2I2VPipeline and AniSoraI2VCogVideoXPipeline with their pre-/post-process hooks, while also removing the FluxPipeline registry entries and the central sequence-parallelism hook.
vllm_omni/diffusion/models/anisora/pipeline_anisora_v2_i2v.py | Introduces the Wan2.1-based AniSora V2/V3 I2V pipeline with hybrid loading (Wan2.1 base components + AniSora transformer weights, including key-name conversion, VAE-based conditioning, and FlowUniPC sampling).
vllm_omni/diffusion/models/anisora/pipeline_anisora_i2v_cogvideox.py | Adds a CogVideoX-based AniSora 5B I2V pipeline using diffusers’ native CogVideoX components and implements image encoding, 3D rotary embeddings, and DDIM-based denoising.
vllm_omni/diffusion/models/anisora/__init__.py | Exposes the two new AniSora pipelines as part of the diffusion models package.
examples/offline_inference/image_to_video/image_to_video.py | Generalizes the example script description/usage to include AniSora 5B and 14B models alongside existing Wan2.2 I2V/TI2V models.
examples/offline_inference/image_to_video/README.md | Expands the image-to-video README with dedicated AniSora V1/V2 sections and example commands, plus reorganized Wan2.2 usage notes.
docs/models/supported_models.md | Updates the supported-models table to include AniSoraI2VCogVideoXPipeline and remove some previous rows (e.g., Flux, certain TTS entries).
docs/.nav.yml | Adds navigation entries for LoRA inference examples and several Omni connector design docs in the user guide and design sections.

Comment thread vllm_omni/diffusion/models/anisora/pipeline_anisora_i2v_cogvideox.py Outdated
Comment thread vllm_omni/diffusion/models/anisora/pipeline_anisora_v2_i2v.py Outdated
Comment thread vllm_omni/diffusion/registry.py Outdated
Comment thread vllm_omni/diffusion/models/anisora/pipeline_anisora_v2_i2v.py Outdated
Comment thread vllm_omni/diffusion/models/anisora/pipeline_anisora_v2_i2v.py Outdated
Comment thread vllm_omni/diffusion/models/anisora/pipeline_anisora_i2v_cogvideox.py Outdated
Comment thread docs/models/supported_models.md Outdated
Comment on lines +549 to +595


# Simple test
if __name__ == "__main__":
    import urllib.request

    print("Testing AniSora I2V CogVideoX Pipeline...")

    # Create pipeline
    pipeline = AniSoraI2VCogVideoXPipeline(
        model_path="Disty0/Index-anisora-5B-diffusers",
        dtype=torch.bfloat16,
    )
    pipeline.to("cuda")

    # Download test image
    url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/cat.png"
    urllib.request.urlretrieve(url, "/tmp/cat.png")
    image = PIL.Image.open("/tmp/cat.png").convert("RGB")

    # Generate
    output = pipeline(
        prompt="a cat walking in the garden, high quality",
        image=image,
        negative_prompt="low quality, blurry",
        num_inference_steps=10,
        height=480,
        width=832,
        num_frames=17,
    )

    print(f"Output type: {type(output)}")
    print(f"Output.output shape: {output.output.shape}")

    # Check for NaN
    if torch.isnan(output.output).any():
        print("WARNING: Output contains NaN!")
    else:
        print("Output looks valid (no NaN)")

    # Save video
    from diffusers.utils import export_to_video

    video = output.output[0].permute(1, 2, 3, 0).cpu().numpy()  # [C, F, H, W] -> [F, H, W, C]
    video = ((video + 1) / 2 * 255).clip(0, 255).astype("uint8")
    export_to_video(video, "/workspace/test_cogvideox.mp4", fps=16)
    print("Video saved to /workspace/test_cogvideox.mp4")

Copilot AI Jan 28, 2026

The if __name__ == "__main__" block instantiates AniSoraI2VCogVideoXPipeline(model_path=..., dtype=...), but the class __init__ only takes od_config (plus keyword-only) and doesn’t accept these arguments. This makes the in-file test unusable and may mislead users about how to construct the pipeline; it should either be removed or rewritten to go through OmniDiffusionConfig / the Omni entrypoint.

Suggested change: delete the test block quoted above (the suggestion lists only the existing lines, with no replacement).

Comment thread examples/offline_inference/image_to_video/README.md Outdated
Comment thread examples/offline_inference/image_to_video/README.md Outdated
@ZJY0516
Member

ZJY0516 commented Feb 4, 2026

@dorhuri123 It seems that the first video has accuracy problems

@dorhuri123
Author

dorhuri123 commented Feb 4, 2026

@ZJY0516 agreed, the first output looks clearly off (strong color inversion / desaturation compared to the input). I re-ran the exact same settings and got a cleaner output on my side (after all the changes that were made to suit the existing example file).

Could you point to the specific behavior you want to treat as the “accuracy issue” (e.g., color inversion, identity drift, motion artifacts)? That would help me isolate whether it’s still a code path issue or just variability.

Same input/settings for both runs:

attempt 1

anisora_v1_demo.mp4

attempt 2

anisora_v1_demo.1.mp4

@ZJY0516
Member

ZJY0516 commented Feb 4, 2026

The shape and movement of the cat in the video don’t look quite right. Could you compare this with the official implementation to verify? @dorhuri123

@dorhuri123
Author

@ZJY0516 I compared against the official Diffusers CogVideoXImageToVideoPipeline using the same input/settings on an RTX 6000 (Blackwell Server Edition). The output shows the same shape/motion characteristics as the vLLM run, so it looks like this is model behavior rather than an integration issue.

Baseline (official diffusers) commands + script used:

# env
python -m venv ~/anisora-diffusers
source ~/anisora-diffusers/bin/activate
pip install --upgrade pip
pip install diffusers==0.36.0 transformers accelerate safetensors huggingface_hub \
            sentencepiece tiktoken protobuf imageio imageio-ffmpeg
pip install --pre --upgrade torch --index-url https://download.pytorch.org/whl/nightly/cu128

# input image
wget -O /tmp/cat.png https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/cat.png

# run_diffusers_anisora_v1.py
import torch
import PIL.Image
from diffusers import CogVideoXImageToVideoPipeline
from diffusers.utils import export_to_video

pipe = CogVideoXImageToVideoPipeline.from_pretrained(
    "Disty0/Index-anisora-5B-diffusers",
    torch_dtype=torch.bfloat16,
).to("cuda")

image = PIL.Image.open("/tmp/cat.png").convert("RGB")

video = pipe(
    prompt="A cat playing with yarn",
    image=image,
    height=480,
    width=720,
    num_frames=81,
    num_inference_steps=50,
    guidance_scale=5.0,
    output_type="np",
).frames[0]

export_to_video(video, "anisora_v1_diffusers.mp4", fps=16)

I’ll attach the diffusers output video in this comment. If you’re seeing a specific artifact you want addressed, let me know the exact behavior and I’ll dig deeper.

anisora_v1_diffusers.mp4

@hsliuustc0106
Collaborator

resolve conflicts please

@hsliuustc0106
Collaborator

Which transformers version are you using?

@hsliuustc0106
Collaborator

Did you keep the seed the same for the comparison with diffusers?

@dorhuri123
Author

dorhuri123 commented Feb 11, 2026

Which transformers version are you using?

I didn't pin the transformers version in that comparison — I ran pip install transformers which installed the latest version compatible with diffusers 0.36.0. I don't have that environment anymore so I can't check the exact version. I can re-create the environment and report back with the exact version if needed.

@dorhuri123
Author

dorhuri123 commented Feb 11, 2026

Did you keep the seed the same for the comparison with diffusers?

Good catch — I didn't set a fixed seed in the diffusers baseline script. The comparison was qualitative, showing that the same motion/shape characteristics appear in both implementations. I can re-run with a fixed seed in both (generator=torch.Generator("cuda").manual_seed(42)) if needed.

Collaborator

@lishunyang12 lishunyang12 left a comment

Nice work on this — left a few thoughts inline, mostly around some small things I noticed.

Comment thread docs/models/supported_models.md Outdated
|`StableDiffusion3Pipeline` | Stable-Diffusion-3 | `stabilityai/stable-diffusion-3.5-medium` |
|`Flux2KleinPipeline` | FLUX.2-klein | `black-forest-labs/FLUX.2-klein-4B`, `black-forest-labs/FLUX.2-klein-9B` |
|`FluxPipeline` | FLUX.1-dev | `black-forest-labs/FLUX.1-dev` |
|`StableAudioPipeline` | Stable-Audio-Open | `stabilityai/stable-audio-open-1.0` |
Collaborator

@lishunyang12 lishunyang12 Feb 22, 2026

Looks like this diff might have accidentally deleted the FluxPipeline and Qwen3TTSForConditionalGeneration rows — probably a rebase artifact? Also, would it make sense to add the AniSora V2 (14B) entry to the table too?

Author

Good catch — this was indeed a rebase artifact. I've synced the GPU table with upstream main (restored FluxPipeline and the three Qwen3TTSForConditionalGeneration rows) and added the AniSora V2 (14B) entry as well.

Collaborator

Makes sense, thanks for cleaning that up.

"pipeline_anisora_i2v_cogvideox",
"AniSoraI2VCogVideoXPipeline",
),
}
Collaborator

@lishunyang12 lishunyang12 Feb 22, 2026

Just something I was wondering about — registering CogVideoXImageToVideoPipeline as the key means any model declaring that class name would get routed here, including vanilla CogVideoX I2V models. Would a more specific key work better, or is there a reason for keeping it generic?

Author

You're right — using the generic diffusers class name would hijack vanilla CogVideoX models. I've renamed the registry key to AniSoraI2VCogVideoXPipeline and added a targeted mapping in omni_diffusion.py that only converts CogVideoXImageToVideoPipeline → AniSoraI2VCogVideoXPipeline when "anisora" appears in the model name. This way vanilla CogVideoX models are unaffected.

Collaborator

Much better — scoping by model name avoids the hijacking issue.

# Load weights from AniSora
logger.info("Downloading AniSora weights...")
import glob
import os as os_module
Collaborator

@lishunyang12 lishunyang12 Feb 22, 2026

Minor nit — os is already imported at the top of the file (line 21), so the import os as os_module here shadows it a bit. Not a big deal though.

Author

Fixed — removed the redundant import and switched both usages to the top-level os.


# Load state dict
missing, unexpected = self.transformer.load_state_dict(converted_state_dict, strict=False)
if missing:
Collaborator

@lishunyang12 lishunyang12 Feb 22, 2026

I noticed missing keys are only logged at debug level, which is off by default. Since a wrong key mapping could be tricky to debug, would it help to log at warning level or add a threshold check? Just a thought.

Author

Good point — a broken key mapping would be very hard to diagnose with debug-level logging. Changed both missing and unexpected keys to warning level, and removed the 10-key threshold so all keys are always logged. This way any mismatch is immediately visible.
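
A sketch of what warning-level reporting for the two key lists could look like. The helper name `report_key_mismatch` and the logger name are made up for illustration; the `missing` and `unexpected` lists correspond to what `load_state_dict(..., strict=False)` returns in the snippet under review.

```python
import logging

logger = logging.getLogger("anisora_sketch")  # hypothetical logger name


def report_key_mismatch(missing, unexpected, log=logger):
    """Log every missing/unexpected key at WARNING level; return the total."""
    for key in missing:
        log.warning("Missing key after conversion: %s", key)
    for key in unexpected:
        log.warning("Unexpected key after conversion: %s", key)
    return len(missing) + len(unexpected)
```

Returning the count lets a caller decide whether a mismatch should be fatal, which is one way to implement the threshold idea raised in the review.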

Collaborator

LGTM.


# Classifier-free guidance
if do_classifier_free_guidance:
    noise_uncond = self.transformer(
Collaborator

@lishunyang12 lishunyang12 Feb 22, 2026

I noticed CFG here runs the transformer twice per step instead of batching conditional and unconditional together. The V1 pipeline does batch them with torch.cat. Is there a specific reason V2 does it differently, or could it use the same approach?

Author

No specific reason — this was an oversight. I've refactored V2 to batch conditional and unconditional inputs with torch.cat in a single forward pass, matching V1's approach. This halves the number of transformer calls per denoising step.
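
The difference between the two CFG strategies can be illustrated without torch. In this toy sketch a "model" is just a callable over a batch (a Python list) that counts its invocations; in the real pipelines the batch would be built with `torch.cat`. All names here are hypothetical, and the guidance formula is the usual `uncond + scale * (cond - uncond)`.

```python
class CountingModel:
    """Toy stand-in for the transformer: one call processes a whole batch."""

    def __init__(self):
        self.calls = 0

    def __call__(self, latents, contexts):
        self.calls += 1
        return [x * c for x, c in zip(latents, contexts)]


def cfg_two_pass(model, latent, cond, uncond, scale):
    # Two separate forward passes per denoising step.
    noise_cond = model([latent], [cond])[0]
    noise_uncond = model([latent], [uncond])[0]
    return noise_uncond + scale * (noise_cond - noise_uncond)


def cfg_batched(model, latent, cond, uncond, scale):
    # One forward pass over a batch of two: [unconditional, conditional].
    noise_uncond, noise_cond = model([latent, latent], [uncond, cond])
    return noise_uncond + scale * (noise_cond - noise_uncond)
```

Both functions produce the same guided prediction; the batched version trades a larger batch (and more peak memory) for a single model call per step.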

Collaborator

Nice, batching should cut the per-step cost significantly.

if isinstance(first_prompt, dict):
    additional_info = first_prompt.get("additional_information", {})
    if isinstance(additional_info, dict) and isinstance(
        additional_info.get("preprocessed_image"), PIL.Image.Image
Collaborator

@lishunyang12 lishunyang12 Feb 22, 2026

I might be misreading this, but video_processor.preprocess() returns a torch.Tensor, so the isinstance(..., PIL.Image.Image) check would always be False, making this branch unreachable. Is the intent to always go through multi_modal_data instead?

Author

You're right — preprocessed_image is always a tensor from VideoProcessor.preprocess(), so the PIL check was dead code. I've simplified the logic to go directly to multi_modal_data["image"] which holds the PIL image needed for CLIP conditioning.

logger.info("Encoding prompts...")
prompt_embeds, negative_prompt_embeds = self.encode_prompt(prompt, negative_prompt)

do_classifier_free_guidance = guidance_scale > 1.0 and negative_prompt_embeds is not None
Collaborator

@lishunyang12 lishunyang12 Feb 22, 2026

Just want to make sure — guidance_scale seems to only take effect when negative_prompt_embeds is provided, but I don't see a default negative prompt being set. Is that intentional, or should there be an empty-string default when guidance_scale > 1.0?

Author

Not intentional — without a default, CFG was silently becoming a no-op when no negative prompt was provided. I've added a default of "" (empty string) in both V1 and V2 when guidance_scale > 1.0 and no negative prompt is given. This matches the behavior of diffusers' official pipelines.
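
The defaulting rule described in this reply can be sketched as a small helper. The function name is hypothetical and the PR may implement the rule inline; only the behaviour (empty string when `guidance_scale > 1.0` and no negative prompt is given) comes from the comment above.

```python
from typing import Optional


def resolve_negative_prompt(negative_prompt: Optional[str], guidance_scale: float) -> Optional[str]:
    """Fall back to an empty negative prompt when CFG is requested."""
    if negative_prompt is None and guidance_scale > 1.0:
        return ""
    return negative_prompt
```

With this in place, the `negative_prompt_embeds is not None` condition in the quoted code no longer disables guidance silently when the caller omits a negative prompt.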

Collaborator

Good fix — matching diffusers' default behavior seems right.


# Default paths for components
DEFAULT_WAN_BASE = "Wan-AI/Wan2.1-I2V-14B-480P-Diffusers"
DEFAULT_ANISORA_TRANSFORMER = "aardsoul-music/Wan2.1-Anisora-14B"
Collaborator

@lishunyang12 lishunyang12 Feb 22, 2026

Small thing — DEFAULT_ANISORA_TRANSFORMER doesn't seem to be used anywhere. Is it planned for future use, or can it be removed?

Author

No plans for it — the transformer path always comes from od_config.model. Removed.

Comment thread vllm_omni/entrypoints/omni_diffusion.py Outdated
model_id = (od_config.model or "").lower()
if (
    od_config.model_class_name is None
    and "anisora" in model_id
Collaborator

@lishunyang12 lishunyang12 Feb 22, 2026

The "anisora" in model_id check might be a bit fragile — someone with a path like /data/anisora_experiment/some-other-model could accidentally match. Would a config-based check be more reliable? Just a thought.

Author

Good point. I've changed both AniSora detection paths to check os.path.basename() of the model path/ID instead of the full string, so only the actual model name is matched. A fully config-based approach would require changes upstream (e.g., a field in config.json), so basename matching is the best we can do for now since these community repos don't ship model_index.json.
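A minimal sketch of the basename check, assuming the same `os.path.basename(...).rstrip("/").lower()` shape quoted elsewhere in this thread. The helper name `looks_like_anisora` is hypothetical and used only for illustration.

```python
import os


def looks_like_anisora(model):
    """Match "anisora" only in the last path component of a model path or ID.

    Hypothetical helper: the name is illustrative, not taken from the PR.
    """
    # Strip a trailing slash first, otherwise basename() returns "".
    name = os.path.basename((model or "").rstrip("/")).lower()
    return "anisora" in name


# A parent directory containing "anisora" no longer triggers a match.
print(looks_like_anisora("/data/anisora_experiment/some-other-model"))
# Hub IDs and local paths ending in the model name still match.
print(looks_like_anisora("IndexTeam/AniSora-v1-i2v-diffusers"))
print(looks_like_anisora("/models/Wan2.1-Anisora-14B/"))
```

The remaining limitation is that any model whose own directory name contains "anisora" still matches, which is why this stays a heuristic rather than a config-based check.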

Collaborator

Yeah, basename matching sounds like a reasonable middle ground for now.


class AniSoraI2VCogVideoXPipeline(nn.Module):
# vLLM uses this flag to decide whether to feed dummy images in warmup
support_image_input = True
Collaborator

@lishunyang12 lishunyang12 Feb 22, 2026

Minor thing — the class docstring is after support_image_input = True, so Python would attach it to that attribute rather than the class. Might want to move it up?

Author

Fixed — moved the docstring to the first statement in the class body (before support_image_input) in both V1 and V2 pipelines.
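The effect of docstring placement can be checked in plain Python. The two classes below are stand-ins written for this example, not the pipeline classes from the PR.

```python
class Misplaced:
    support_image_input = True
    """This string is a bare expression, so the class gets no docstring."""


class Correct:
    """Docstring as the first statement in the class body."""

    support_image_input = True


# At runtime only a string in first position becomes __doc__.
print(Misplaced.__doc__)
print(Correct.__doc__)
```

Some documentation tools read a string placed after an assignment as an attribute docstring, but the interpreter itself leaves `__doc__` unset in that case, so `help()` shows nothing for the class.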

@hsliuustc0106
Collaborator

@vllm-omni-reviewer

@github-actions

🤖 VLLM-Omni PR Review

Code Review: Add Index-AniSora I2V Support

1. Overview

This PR adds support for Index-AniSora Image-to-Video models, supporting both the 5B (CogVideoX-based) and 14B (Wan2.1-based) variants. The implementation includes:

  • Two new pipeline modules with hybrid loading for V2
  • Registry and entrypoint integration
  • Documentation and example updates

Overall Assessment: LGTM with suggestions - The implementation is well-structured and follows existing patterns, but has several issues that should be addressed before merging.


2. Code Quality

Positive Aspects

  • Well-documented code with clear docstrings and comments
  • Good logging throughout for debugging
  • Clean separation between V1 and V2 architectures
  • Proper type hints used consistently

Issues Found

Critical: Incorrect ValueError usage

vllm_omni/diffusion/models/anisora/pipeline_anisora_i2v_cogvideox.py:79-82

raise ValueError(
    """No image is provided. This model requires an image to run.""",
    """Please correctly set `"multi_modal_data": {"image": <an image object or file path>, …}`""",
)

This raises ValueError with two arguments, which creates a tuple exception message. Same issue at lines 85-88 and in pipeline_anisora_v2_i2v.py:77-84.

Fix:

raise ValueError(
    "No image is provided. This model requires an image to run. "
    "Please correctly set `multi_modal_data: {image: <an image object or file path>, …}`"
)
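For reference, the difference between the two forms is easy to confirm in plain Python. This is a minimal sketch with shortened placeholder messages rather than the pipeline's actual strings.

```python
# Two positional arguments: str() of the exception is the repr of the args tuple.
two_args = ValueError("No image is provided.", "Please set multi_modal_data.")

# Adjacent string literals are concatenated into a single argument.
one_arg = ValueError(
    "No image is provided. "
    "Please set multi_modal_data."
)

print(str(two_args))
print(str(one_arg))
print(len(two_args.args), len(one_arg.args))
```

The only syntactic difference is the comma between the literals, which is why this mistake is easy to miss in review.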

Potential Bug: Duplicate weight loading

vllm_omni/diffusion/models/anisora/pipeline_anisora_v2_i2v.py:248-252

def load_weights(self, weights: Iterable[tuple[str, torch.Tensor]]) -> set[str]:
    """Load weights using AutoWeightsLoader for vLLM integration."""
    loader = AutoWeightsLoader(self)
    return loader.load_weights(weights)

The V2 pipeline loads weights manually in __init__ (lines 196-243), but also provides load_weights. If vLLM calls load_weights after initialization, weights could be loaded twice or incorrectly.

Suggestion: Either remove load_weights for V2 or make __init__ not load weights and rely on vLLM's weight loading mechanism.


3. Architecture & Design

Positive Aspects

  • Hybrid loading approach for V2 is well-documented and necessary for community weight compatibility
  • Pre/post-process functions follow existing patterns in the codebase
  • Key name conversion logic is comprehensive and clearly documented

Concerns

Fragile model detection logic

vllm_omni/entrypoints/omni_diffusion.py:71-77

if (
    class_name == "CogVideoXImageToVideoPipeline"
    and "anisora" in os.path.basename((od_config.model or "").rstrip("/")).lower()
):
    class_name = "AniSoraI2VCogVideoXPipeline"

This string matching could match unintended models (e.g., /models/anisora_experiment/other-model). Consider:

  1. Adding a config file check for AniSora-specific markers
  2. Using a more specific pattern match
  3. Documenting the naming convention requirement

Forced download in V2 pipeline

vllm_omni/diffusion/models/anisora/pipeline_anisora_v2_i2v.py:196-200

if local_anisora:
    weight_path = model_path
else:
    weight_path = snapshot_download(model_path, local_files_only=False)

The local_files_only=False forces network access even when files might be cached. Should respect offline mode:

weight_path = snapshot_download(model_path, local_files_only=local_anisora)

Missing offline mode support for Wan base

vllm_omni/diffusion/models/anisora/pipeline_anisora_v2_i2v.py:167-175

The Wan2.1 base components are loaded with local_files_only=local_wan, but local_wan is determined by checking if the default path exists locally, which will almost always be False for the default HuggingFace ID.


4. Security & Safety

Input Validation

Missing path validation

vllm_omni/diffusion/models/anisora/pipeline_anisora_v2_i2v.py:201-206

safetensor_files = glob.glob(os.path.join(weight_path, "*.safetensors"))
if not safetensor_files:
    safetensor_files = glob.glob(os.path.join(weight_path, "**/*.safetensors"), recursive=True)

The glob patterns could potentially match unintended files. Consider:

  1. Validating that files are within the expected directory
  2. Filtering for specific expected file patterns

Silent failure for missing CLIP encoder

vllm_omni/diffusion/models/anisora/pipeline_anisora_v2_i2v.py:177-184

except Exception as e:
    logger.warning("CLIP image encoder not available: %s", e)
    self.image_processor = None
    self.image_encoder = None
    self.has_image_encoder = False

Catching broad Exception and continuing could mask real issues. The I2V pipeline may produce degraded results without CLIP conditioning. Consider:

  1. Making CLIP encoder required for I2V models
  2. Or raising a clear warning that output quality may be affected

5. Testing & Documentation

Documentation

  • ✅ README updated with new models
  • ✅ Usage examples provided
  • ✅ Demo results with specific settings
  • ✅ Technical notes explain hybrid loading approach

Missing

  • ❌ No unit tests for the new pipelines
  • ❌ No validation tests for key conversion logic
  • ❌ No tests for model detection logic in entrypoint

Suggested Test Cases

  1. Key conversion correctness (compare converted keys against expected mapping)
  2. Model detection logic for various model name patterns
  3. Error handling for missing images
  4. Dimension validation for various input sizes

6. Specific Suggestions

pipeline_anisora_i2v_cogvideox.py

| Line | Issue | Suggestion |
|---|---|---|
| 79-82 | ValueError with multiple args | Combine into single string |
| 85-88 | ValueError with multiple args | Combine into single string |
| 368-370 | load_weights may conflict with init loading | Document that this is for vLLM internal use only |

pipeline_anisora_v2_i2v.py

| Line | Issue | Suggestion |
|---|---|---|
| 77-84 | ValueError with multiple args | Combine into single string |
| 167-175 | Offline mode not properly supported | Add local_files_only parameter to config |
| 196-200 | Forced network access | Use local_files_only=local_anisora |
| 248-252 | Duplicate weight loading risk | Remove or document behavior clearly |

omni_diffusion.py

| Line | Issue | Suggestion |
|---|---|---|
| 71-77 | Fragile string matching | Add additional validation or document naming requirements |
| 88-97 | Fragile string matching | Same as above |

registry.py

| Line | Issue | Suggestion |
|---|---|---|
| 289-290 | Post-process func names | Consider adding docstrings explaining the func signatures |

7. Approval Status

LGTM with suggestions

The PR is well-structured and follows existing patterns in the codebase. The hybrid loading approach for V2 is necessary and well-implemented. However, the following should be addressed:

Required before merge:

  1. Fix the ValueError multi-argument issue (affects error messages displayed to users)

Recommended:

  2. Fix the local_files_only=False forced download in V2 pipeline
  3. Add basic unit tests for key conversion logic
  4. Document the model naming convention requirement for auto-detection

Optional improvements:

  5. Consider making CLIP encoder required for V2 I2V
  6. Add validation for the load_weights vs __init__ weight loading in V2


This review was generated automatically by the VLLM-Omni PR Reviewer Bot
using glm-5.

Collaborator

@lishunyang12 lishunyang12 left a comment

All previous comments addressed — the CFG batching, registry rename, logging, and docstring fixes all look good. The only remaining heuristic is the basename check in omni_diffusion.py, but that's a reasonable approach for now. LGTM.

@dorhuri123
Author

Thanks for the thorough review and the LGTM! I really appreciate you taking the time to go through everything.

The @vllm-omni-reviewer bot flagged a few additional items — most were false positives or already covered, but it did catch a real bug: our ValueError calls were passing two string arguments (creating a tuple message) instead of a single concatenated string. Just pushed a fix for that in 57e3eb0.

All feedback addressed — ready for merge whenever you're comfortable!

@hsliuustc0106 hsliuustc0106 added the ready label to trigger buildkite CI label Feb 23, 2026
@hsliuustc0106
Collaborator

@wtomin PTAL

@dorhuri123 dorhuri123 force-pushed the feature/index-anisora branch from beaf8af to 7e9fd7d Compare March 3, 2026 12:55
@dorhuri123
Author

dorhuri123 commented Mar 19, 2026

@wtomin

PR Update: Benchmarks, E2E Tests, SP Support & V2 TP Bug Report

What this PR adds

  • V1 (5B, CogVideoX-based) I2V pipeline with Ulysses sequence parallelism (SP), tensor parallelism (TP), and FP8 quantization
  • V2 (14B, Wan2.1-based) I2V pipeline with AniSora→diffusers weight key conversion and TP support
  • Benchmark script (benchmark_anisora.py) with --warmup, --tp, --sp, --quantization flags
  • E2E offline tests: single GPU, TP=2, SP=2, FP8, V2 TP=2
  • E2E online serving tests: full job lifecycle (create → poll → download → delete)

Benchmark Results (2× H100 80GB)

| Model | Backend | VRAM (GiB) | Latency (s) |
|---|---|---|---|
| V1 (5B) | diffusers | 34.46 | 107.88 |
| V1 (5B) | vllm-omni (TP=1) | 76.86 | 200.75 |
| V1 (5B) | vllm-omni (TP=2) | 122.05 | 149.25 |
| V1 (5B) | vllm-omni (SP=2) | 135.36 | 131.77 |
| V1 (5B) | vllm-omni (FP8) | 65.95 | 200.92 |
| V2 (14B) | vllm-omni (TP=2) | 82.44 | 274.22 |

  • SP=2 cuts latency by ~34% vs TP=1 (131.77s vs 200.75s)
  • FP8 reduces VRAM by ~14% (76.86 → 65.95 GiB) with essentially unchanged latency (200.92s vs 200.75s)
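The two percentages quoted above can be reproduced from the table values. The short calculation below also shows the equivalent speedup factor, since a ~34% latency reduction is not the same number as a throughput ratio. Variable names are illustrative only.

```python
# Values copied from the benchmark table (V1 5B, vllm-omni rows).
tp1_latency_s = 200.75
sp2_latency_s = 131.77
tp1_vram_gib = 76.86
fp8_vram_gib = 65.95

# Relative latency reduction of SP=2 against TP=1.
latency_reduction = (tp1_latency_s - sp2_latency_s) / tp1_latency_s
# The same comparison expressed as a speedup factor.
speedup_factor = tp1_latency_s / sp2_latency_s
# Relative VRAM reduction of FP8 against the TP=1 baseline.
vram_reduction = (tp1_vram_gib - fp8_vram_gib) / tp1_vram_gib

print(f"latency reduction: {latency_reduction:.1%}")
print(f"speedup factor:    {speedup_factor:.2f}x")
print(f"VRAM reduction:    {vram_reduction:.1%}")
```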

E2E Test Results

Offline inference — 5/5 passed (159s):

test_anisora_v1_offline_single_gpu  PASSED
test_anisora_v1_offline_tp2         PASSED
test_anisora_v2_offline_tp2         PASSED
test_anisora_v1_offline_sp2         PASSED
test_anisora_v1_offline_fp8         PASSED

Online serving — 2/2 passed (116s):

test_anisora_v1_online_create_poll_download_delete  PASSED
test_anisora_v2_online_create_poll_download_delete  PASSED

Known Issue: V2 (Wan2.1 14B) Quality Degradation with TP=2

We discovered a pre-existing bug in the Wan2.1 transformer's TP=2 weight sharding that causes severe mosaic artifacts. This is NOT introduced by this PR — it exists in the base Wan2.1 TP implementation. All 1143/1143 transformer weights load correctly (verified with diagnostics).

V2 with TP=1 (correct output):

v2_sample.1.mp4

V2 with TP=2 (mosaic artifacts):

v2_sample.mp4

This should be tracked as a separate issue for the Wan2.1 transformer TP implementation.

@@ -0,0 +1,197 @@
# SPDX-License-Identifier: Apache-2.0
Collaborator

@wtomin wtomin Mar 23, 2026

Please use RFC #1832 as a reference for your online serving test script. For now, you can test TP only.

Author

@dorhuri123 dorhuri123 Mar 26, 2026

Done — added test_anisora_v1_online_tp2_create_poll_download_delete (full job lifecycle with --tensor-parallel-size 2), following your guidance from #1832.

Collaborator

This file's name has a minor mismatch. Please check #1682 for reference.

Collaborator

In test-nightly-diffusion.yaml, the online serving tests are launched by:

pytest -s -v tests/e2e/online_serving/test_*_expansion.py

Therefore, I recommend renaming this test script to test_anisora_expansion.py.

Afterwards, I can add a nightly-test label, and launch a buildkite test with this model's online serving test.

Comment thread examples/offline_inference/image_to_video/benchmark_anisora.py Outdated
@wtomin
Collaborator

wtomin commented Mar 23, 2026

There is an existing issue about a TP accuracy problem: #1713. Please check whether it is the same problem.

Also, since this model supports FP8, please update docs/user_guide/diffusion_acceleration.md.

@Gaohan123 Gaohan123 removed this from the v0.18.0 milestone Mar 23, 2026
@dorhuri123 dorhuri123 force-pushed the feature/index-anisora branch from 8402714 to 241d9c6 Compare March 26, 2026 08:41
@dorhuri123
Author

@wtomin
Confirmed — same root cause as #1713: unsynchronized RNG across ranks. Without a fixed seed, each rank initializes from independent noise, causing mosaic artifacts. For TP=2 the starting latents diverge across ranks; for SP=2 it's worse — the noise tensor is spatially split across ranks, so independent RNG creates a hard discontinuity at the split boundary that propagates through every denoising step.

Our offline tests pass seed=42 explicitly and all pass correctly. The online path already accepts a seed field in the request payload, which avoids the issue when set. A proper fix would be a sensible default seed in OmniDiffusionSamplingParams as suggested in #1713.
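A toy illustration of the seeding point, using Python's standard `random` module as a stand-in for the per-rank torch RNG. The `rank_noise` helper and the seed values are hypothetical and exist only for this sketch.

```python
import random


def rank_noise(seed, n=4):
    """Draw n Gaussian samples from a generator seeded with `seed`.

    Hypothetical stand-in for one rank sampling its initial latents.
    """
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


# Unsynchronized case: each rank ends up with a different seed,
# so the ranks start denoising from different noise.
rank0_unsynced = rank_noise(seed=1000)
rank1_unsynced = rank_noise(seed=1001)
print(rank0_unsynced == rank1_unsynced)

# Synchronized case: every rank uses the same explicit seed (42 in the
# offline tests), so the starting noise is identical across ranks.
rank0_synced = rank_noise(seed=42)
rank1_synced = rank_noise(seed=42)
print(rank0_synced == rank1_synced)
```

This only models the "same seed gives same noise" property; it does not reproduce the spatial split used by SP.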

Also updated docs/user_guide/diffusion_acceleration.md: added AniSora V1 to the VideoGen (Ulysses-SP ✅) and Quantization (FP8 ✅) tables.

# SPDX-FileCopyrightText: Copyright contributors to the vLLM project
"""
E2E offline inference tests for Index-AniSora I2V models.

Collaborator

@wtomin wtomin Mar 31, 2026

To check functionality, we prioritize the online serving test script over the offline inference script. If test cases overlap between the two scripts, I recommend keeping the test case (e.g., tp=2) in the online serving test script and deleting it from the offline inference test script. This prevents duplicated test cases.

Author

Removed test_anisora_v1_offline_tp2 and test_anisora_v2_offline_tp2 from the offline test file. TP=2 lifecycle coverage is now maintained only in test_anisora_online.py via test_anisora_v1_online_tp2_create_poll_download_delete as recommended.

@wtomin
Collaborator

wtomin commented Mar 31, 2026

Please resolve the conflicts.

@dorhuri123 dorhuri123 force-pushed the feature/index-anisora branch 2 times, most recently from 95f995e to aeace8e Compare April 8, 2026 23:59
@dorhuri123
Author

Conflicts resolved and rebased on latest main. Picked up upstream's doc restructure (acceleration docs merged into diffusion_features.md) and added AniSora V1/V2 to the new VideoGen feature table there.

@dorhuri123 dorhuri123 force-pushed the feature/index-anisora branch from aeace8e to 1edfb06 Compare April 20, 2026 21:40
@dorhuri123
Author

@wtomin Rebased on latest main and resolved the conflicts (updated the new diffusion_features.md with AniSora V1/V2 rows and kept the registry merged). Could you take another look when you get a chance? Thanks!

@dorhuri123 dorhuri123 force-pushed the feature/index-anisora branch from 1edfb06 to 768ad3a Compare April 28, 2026 16:38
@wtomin
Collaborator

wtomin commented May 14, 2026

@dorhuri123 Sorry for the delay. Could you rebase onto the latest main? I will try to merge this PR soon.

…nchmarks

- Add AniSora V1 (5B, CogVideoX-based) I2V pipeline with Ulysses SP, TP,
  and FP8 quantization support
- Add AniSora V2 (14B, Wan2.1-based) I2V pipeline with AniSora→diffusers
  weight key conversion and TP support
- Register both pipelines and their pre/post-process hooks in the diffusion
  registry; route via OmniDiffusion entrypoint
- Add e2e offline tests (single GPU, SP=2, FP8) and online serving tests
  (V1 single GPU, V1 TP=2, V2) covering full job lifecycle
- Add AniSora rows to `docs/models/supported_models.md` and the new
  VideoGen feature table in `docs/user_guide/diffusion_features.md`

Signed-off-by: Dor Huri <Dorhuri123@gmail.com>
@dorhuri123
Author

@wtomin Thanks! Rebased onto latest main and resolved the conflicts. It is now a single clean commit, ready to merge.

Comment thread tests/e2e/offline_inference/test_anisora_i2v.py Outdated
Signed-off-by: Didan Deng <33117903+wtomin@users.noreply.github.com>
Labels

new model add new model ready label to trigger buildkite CI

Development

Successfully merging this pull request may close these issues.

[New Model]: Index-AniSora (Bilibili)

8 participants